
feat: add euclidean one2many #188

Open
richyreachy wants to merge 17 commits into main from feat/euclidean_one2many

Conversation

@richyreachy
Collaborator

@richyreachy richyreachy commented Mar 1, 2026

add euclidean one2many

Greptile Summary

This PR adds SIMD-accelerated Euclidean (L2) one-to-many distance computation to the zvec batch distance infrastructure, mirroring the existing inner-product and cosine batch distance paths. It introduces new dispatch and implementation files for FP32/FP16 across AVX2, AVX-512F, and AVX-512FP16 instruction sets, wires them into BaseDistance::ComputeBatch via if constexpr, and adds EuclideanMetric::batch_distance().
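A minimal sketch of the compile-time selection described above, assuming hypothetical tag types (`SquaredEuclideanTag`, `EuclideanTag`, `OtherTag`) and a hypothetical `compute_batch` helper. None of these names come from zvec, and the scalar loop only stands in for the SIMD kernels.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <type_traits>
#include <vector>

// Hypothetical tag types standing in for the distance-type template
// parameter; the real zvec names may differ.
struct SquaredEuclideanTag {};
struct EuclideanTag {};
struct OtherTag {};

// Scalar squared L2 distance between two vectors of length dim.
inline float squared_l2(const float* a, const float* b, std::size_t dim) {
  float sum = 0.0f;
  for (std::size_t d = 0; d < dim; ++d) {
    const float diff = a[d] - b[d];
    sum += diff * diff;
  }
  return sum;
}

// One-to-many batch: the branch is chosen at compile time with if constexpr.
// Returns true when a dedicated batch path handled the request, false when
// the caller should fall back to its generic path.
template <typename Tag>
bool compute_batch(const std::vector<const float*>& candidates,
                   const float* query, std::size_t dim,
                   std::vector<float>& results) {
  assert(results.size() == candidates.size());
  if constexpr (std::is_same_v<Tag, SquaredEuclideanTag> ||
                std::is_same_v<Tag, EuclideanTag>) {
    for (std::size_t i = 0; i < candidates.size(); ++i) {
      results[i] = squared_l2(query, candidates[i], dim);
      if constexpr (std::is_same_v<Tag, EuclideanTag>) {
        // Euclidean distance is the square root of the squared distance.
        results[i] = std::sqrt(results[i]);
      }
    }
    return true;
  } else {
    return false;
  }
}
```

Calling `compute_batch<EuclideanTag>(...)` and `compute_batch<SquaredEuclideanTag>(...)` on the same inputs should differ only by the final square root, which makes the pair easy to cross-check in a unit test.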

Key issues found:

  • Wrong distance formula in euclidean_distance_batch_impl_fp32_avx512.cc (lines 66–69): The tail-element path uses _mm512_mask3_fmadd_ps(q, ptrs[i], accs[i], mask), which accumulates an inner product (q·d) instead of the squared difference (q−d)². Any dimension that is not a multiple of 16 will therefore produce incorrect distances on AVX-512F hardware.

  • Wrong distance formula in euclidean_distance_batch_impl_fp16_avx512.cc (line 77): The intermediate chunk for 16–31 remaining FP16 elements uses _mm512_fmadd_ps(q, data_regs[i], accs[i]) (inner product) rather than a subtraction followed by fmadd(diff, diff, ...). Dimensions whose remainder modulo 32 falls in the range 16–31 are affected.

  • Unrelated regression — HammingMetric::batch_distance() removed (hamming_metric.cc): The method override is deleted without replacement. The base-class default returns nullptr, silently breaking any caller that invokes batch distance for binary data through a HammingMetric. The related DT_BINARY32/DT_BINARY64 cases were also dropped from SquaredEuclideanMetric::batch_distance() in euclidean_metric.cc.

  • Tests silenced with #if 0 (hnsw_streamer_test.cc): TestBinaryConverter and TestBasicRefiner are unconditionally disabled, masking the regressions above. These should be fixed rather than suppressed.
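The two formula bugs above share one pattern: the main loop accumulates (q−d)², while a tail path accumulates q·d. The sketch below is a scalar model of a 16-lane kernel, not the PR's code; `kLanes`, `squared_l2_blocked`, and `squared_l2_reference` are illustrative names. It shows the tail using the same update as the main loop, plus a plain reference that can serve as a test oracle.

```cpp
#include <cassert>
#include <cstddef>

// Number of floats handled per step; 16 mirrors one 512-bit register of FP32.
constexpr std::size_t kLanes = 16;

// Scalar model of a blocked kernel with a partial tail.
inline float squared_l2_blocked(const float* q, const float* d,
                                std::size_t dim) {
  assert(q != nullptr && d != nullptr);
  float acc = 0.0f;
  std::size_t i = 0;
  // Main loop over full blocks: acc += (q - d)^2 for each lane.
  for (; i + kLanes <= dim; i += kLanes) {
    for (std::size_t lane = 0; lane < kLanes; ++lane) {
      const float diff = q[i + lane] - d[i + lane];
      acc += diff * diff;
    }
  }
  // Tail: must apply the same squared-difference update as the main loop.
  // Accumulating q[i] * d[i] here would mix an inner product into the sum.
  for (; i < dim; ++i) {
    const float diff = q[i] - d[i];
    acc += diff * diff;
  }
  return acc;
}

// Straightforward reference implementation used as an oracle in tests.
inline float squared_l2_reference(const float* q, const float* d,
                                  std::size_t dim) {
  float acc = 0.0f;
  for (std::size_t i = 0; i < dim; ++i) {
    const float diff = q[i] - d[i];
    acc += diff * diff;
  }
  return acc;
}
```

Comparing the kernel against the reference over dimensions such as 15, 16, 17, 31, and 33 exercises every tail path; a test that only uses multiples of the register width would not have caught either bug.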

Confidence Score: 1/5

  • Not safe to merge — multiple critical logic errors produce incorrect Euclidean distances, and unrelated binary metric regressions break existing functionality.
  • Two separate SIMD implementation files contain inner-product accumulation where squared-difference accumulation is required, giving silently wrong distance values for non-full-register-width inputs. Additionally, the removal of HammingMetric::batch_distance() and binary-type cases in SquaredEuclideanMetric appears to be an unintended side-effect, and the two disabled tests hide these regressions from CI.
  • Files needing attention: euclidean_distance_batch_impl_fp32_avx512.cc, euclidean_distance_batch_impl_fp16_avx512.cc, euclidean_distance_batch_impl_fp32_avx2.cc, hamming_metric.cc, euclidean_metric.cc

Important Files Changed

| Filename | Overview |
| --- | --- |
| src/ailego/math_batch/euclidean_distance_batch_impl_fp32_avx512.cc | AVX-512F FP32 tail handling uses inner-product fmadd instead of squared-difference accumulation; this produces completely wrong distances for non-multiples-of-16 dimensions. |
| src/ailego/math_batch/euclidean_distance_batch_impl_fp16_avx512.cc | The 16–31 remaining-elements path computes inner product (fmadd_ps(q, data, acc)) instead of squared difference; the rest of the implementation is correct. |
| src/ailego/math_batch/euclidean_distance_batch_impl_fp32_avx2.cc | Contains the previously-noted syntax error ptrs[i]dim+[3] on line 74 (compile failure); the main AVX2 loop is correctly implemented using sub + fmadd. |
| src/ailego/math_batch/euclidean_distance_batch.h | Defines the dispatch structs and fallback for SquaredEuclideanDistanceBatch / EuclideanDistanceBatch; the EuclideanDistanceBatch correctly applies sqrt after the squared-distance computation. |
| src/ailego/math_batch/euclidean_distance_batch_dispatch.cc | CPU-feature dispatch correctly selects the best available SIMD kernel at runtime for FP32 and FP16 types; int8 path left commented out. |
| src/core/metric/euclidean_metric.cc | Adds EuclideanMetric::batch_distance() (correct); also removes binary-type cases from SquaredEuclideanMetric::batch_distance(), which is an unintended regression. |
| src/core/metric/hamming_metric.cc | Removes HammingMetric::batch_distance() entirely; the base class returns nullptr, breaking batch binary distance computation. |
| tests/core/algorithm/hnsw/hnsw_streamer_test.cc | Two tests (TestBinaryConverter, TestBasicRefiner) are disabled with #if 0, hiding potential regressions from the binary-metric removals. |

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A["BaseDistance::ComputeBatch()"] --> B{DistanceType?}
    B -- "EuclideanDistanceMatrix" --> C["EuclideanDistanceBatch::ComputeBatch()"]
    B -- "SquaredEuclideanDistanceMatrix" --> D["SquaredEuclideanDistanceBatch::ComputeBatch()"]
    B -- "other" --> E["_ComputeBatch() fallback"]

    C --> F["SquaredEuclideanDistanceBatch::ComputeBatch()"]
    F --> G["sqrt(results[i]) for each i"]

    D --> H["SquaredEuclideanDistanceBatchImpl::compute_one_to_many()"]

    H --> I{CPU feature?}
    I -- "AVX-512FP16 (FP16 only)" --> J["avx512fp16_fp16 impl\n✅ correct"]
    I -- "AVX-512F (FP16)" --> K["avx512f_fp16 impl\n⚠️ line 77: inner product bug"]
    I -- "AVX-512F (FP32)" --> L["avx512f_fp32 impl\n⚠️ lines 66-69: inner product bug"]
    I -- "AVX2 (FP16)" --> M["avx2_fp16 impl\n✅ correct"]
    I -- "AVX2 (FP32)" --> N["avx2_fp32 impl\n⚠️ line 74: syntax error"]
    I -- "none" --> O["fallback scalar impl\n✅ correct"]
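
The kernel priority in the flowchart can be sketched as a small selection function. This is an illustrative model only: `CpuFeatures`, `ElemType`, `Kernel`, `select_kernel`, and `kernel_name` are hypothetical names, and the feature flags are passed in rather than queried from the CPU, since the PR's actual detection code is not shown here.

```cpp
#include <cassert>
#include <string_view>

// Hypothetical feature flags; in real code these would come from a CPUID query.
struct CpuFeatures {
  bool avx512fp16 = false;
  bool avx512f = false;
  bool avx2 = false;
};

enum class ElemType { kFp32, kFp16 };
enum class Kernel { kAvx512Fp16, kAvx512F, kAvx2, kScalar };

// Picks the widest available kernel, following the order in the flowchart:
// AVX-512FP16 (FP16 data only), then AVX-512F, then AVX2, then scalar.
inline Kernel select_kernel(const CpuFeatures& cpu, ElemType type) {
  if (type == ElemType::kFp16 && cpu.avx512fp16) return Kernel::kAvx512Fp16;
  if (cpu.avx512f) return Kernel::kAvx512F;
  if (cpu.avx2) return Kernel::kAvx2;
  return Kernel::kScalar;
}

// Human-readable name, handy when logging which path a test exercised.
inline std::string_view kernel_name(Kernel kernel) {
  switch (kernel) {
    case Kernel::kAvx512Fp16: return "avx512fp16";
    case Kernel::kAvx512F: return "avx512f";
    case Kernel::kAvx2: return "avx2";
    case Kernel::kScalar: return "scalar";
  }
  assert(false && "unhandled Kernel value");
  return "unknown";
}
```

Keeping the selection separate from the kernels makes it possible to force each path in tests, so that a bug confined to one instruction set is not hidden by whatever hardware CI happens to run on.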

Comments Outside Diff (1)

  1. tests/core/algorithm/hnsw/hnsw_streamer_test.cc, line 3500

    P2 Tests disabled with #if 0

    TestBinaryConverter and TestBasicRefiner are both silenced with #if 0 in this PR. Permanently disabling tests with preprocessor guards makes failures invisible and can hide regressions. If these tests are failing due to the binary-type removals in hamming_metric.cc / euclidean_metric.cc, the root cause should be fixed rather than the tests suppressed. If the tests are being temporarily skipped for another reason, using DISABLED_ as a Google Test prefix is the conventional way to do this without removing coverage entirely.

Last reviewed commit: "fix: fix format"

Greptile also left 1 inline comment on this PR.

@richyreachy richyreachy requested a review from iaojnh March 1, 2026 14:47
@greptile-apps

greptile-apps bot commented Mar 1, 2026

Greptile Summary

This PR adds Euclidean one-to-many distance computation with SIMD optimizations (AVX2, AVX512F, AVX512FP16) for float32, float16, and int8 data types. However, multiple critical bugs exist in the new implementation files that will cause incorrect distance calculations:

  • euclidean_distance_batch_impl.h: Wrong FMA operation in AVX512F masked tail (line 81) computes dot product instead of squared difference; AVX2 switch statement (lines 128-148) missing dimension offsets
  • euclidean_distance_batch_impl_fp16.h: AVX512F implementation (line 139) computes dot product instead of squared distance
  • euclidean_distance_batch_impl_int8.h: Incorrect pointer arithmetic (line 74) and completely broken switch statement (lines 93-136) with wrong macro usage and array indices
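
The review does not quote the switch statements themselves, so the following is only a scalar sketch of what offset-correct remainder handling can look like. It assumes a 4-wide unrolled main loop and a fall-through switch for the leftover elements; `accumulate` and `squared_l2_unrolled4` are hypothetical names. The point is that every case indexes from the current position `dim`, not from zero.

```cpp
#include <cassert>
#include <cstddef>

// Adds (q[idx] - d[idx])^2 to acc.
inline void accumulate(const float* q, const float* d, std::size_t idx,
                       float& acc) {
  const float diff = q[idx] - d[idx];
  acc += diff * diff;
}

// Squared L2 distance with a 4-wide unrolled loop and a switch for the tail.
inline float squared_l2_unrolled4(const float* q, const float* d,
                                  std::size_t size) {
  assert(q != nullptr && d != nullptr);
  float acc = 0.0f;
  std::size_t dim = 0;
  for (; dim + 4 <= size; dim += 4) {
    accumulate(q, d, dim + 0, acc);
    accumulate(q, d, dim + 1, acc);
    accumulate(q, d, dim + 2, acc);
    accumulate(q, d, dim + 3, acc);
  }
  // 0 to 3 elements remain. Each case falls through to the next, and each
  // index is offset by dim so the tail reads the elements after the last
  // full block rather than the start of the vector.
  switch (size - dim) {
    case 3:
      accumulate(q, d, dim + 2, acc);
      [[fallthrough]];
    case 2:
      accumulate(q, d, dim + 1, acc);
      [[fallthrough]];
    case 1:
      accumulate(q, d, dim + 0, acc);
      break;
    default:
      break;
  }
  return acc;
}
```

Dropping the `dim +` offset in the switch would still compile and would give plausible-looking numbers, which is why this class of bug tends to surface only when results are compared against a reference.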

These bugs were already identified in previous review threads and remain unfixed. Additionally:

  • Removed incorrect Hamming distance cases from SquaredEuclideanMetric::batch_distance()
  • Added batch_distance() method to EuclideanMetric
  • Minor formatting improvements across several files

Confidence Score: 0/5

  • This PR is NOT safe to merge - contains multiple critical bugs that will produce incorrect results
  • Score of 0 reflects multiple critical algorithmic errors in the core distance computation logic that will cause incorrect calculations across all three implementation files (fp32, fp16, int8). These bugs were previously identified but remain unfixed, and will result in wrong distance values being returned to callers.
  • All three implementation files require immediate attention: euclidean_distance_batch_impl.h, euclidean_distance_batch_impl_fp16.h, and euclidean_distance_batch_impl_int8.h

Important Files Changed

| Filename | Overview |
| --- | --- |
| src/ailego/math_batch/euclidean_distance_batch_impl.h | New file with critical bugs in AVX512F masked tail handling (wrong FMA operation) and AVX2 switch statement (missing dim offsets) |
| src/ailego/math_batch/euclidean_distance_batch_impl_fp16.h | New file with critical bug in AVX512F implementation (dot product instead of squared distance at line 139) |
| src/ailego/math_batch/euclidean_distance_batch_impl_int8.h | New file with critical bugs in pointer arithmetic (line 74) and switch statement (lines 91-136: wrong macro usage with pointers and incorrect array indices) |
| src/ailego/math_batch/euclidean_distance_batch.h | New file defining batch distance computation structure, looks correct but depends on buggy implementation files |
| src/ailego/math_batch/distance_batch.h | Added Euclidean and SquaredEuclidean distance batch handlers using constexpr type checks |
| src/core/metric/euclidean_metric.cc | Added batch_distance method to EuclideanMetric and removed incorrect Hamming distance cases from SquaredEuclideanMetric |

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[BaseDistance::ComputeBatch] --> B{Distance Type?}
    B -->|Euclidean| C[EuclideanDistanceBatch::ComputeBatch]
    B -->|SquaredEuclidean| D[SquaredEuclideanDistanceBatch::ComputeBatch]
    B -->|Other| E[Default _ComputeBatch]
    
    D --> F{Data Type?}
    F -->|float| G[SquaredEuclideanDistanceBatchImpl float]
    F -->|int8_t| H[SquaredEuclideanDistanceBatchImpl int8_t]
    F -->|Float16| I[SquaredEuclideanDistanceBatchImpl Float16]
    F -->|Other| J[Fallback Implementation]
    
    G --> K{CPU Features?}
    K -->|AVX512F| L[compute_one_to_many_squared_euclidean_avx512f_fp32]
    K -->|AVX2| M[compute_one_to_many_squared_euclidean_avx2_fp32]
    K -->|None| J
    
    H --> N{CPU Features?}
    N -->|AVX2| O[compute_one_to_many_squared_euclidean_avx2_int8]
    N -->|None| J
    
    I --> P{CPU Features?}
    P -->|AVX512FP16| Q[compute_one_to_many_squared_euclidean_avx512fp16_fp16]
    P -->|AVX512F| R[compute_one_to_many_squared_euclidean_avx512f_fp16]
    P -->|AVX2| S[compute_one_to_many_squared_euclidean_avx2_fp16]
    P -->|None| J
    
    C --> T[Call SquaredEuclideanDistanceBatch]
    T --> U[Apply sqrt to results]
    
    style L fill:#f99,stroke:#333,stroke-width:2px
    style M fill:#f99,stroke:#333,stroke-width:2px
    style O fill:#f99,stroke:#333,stroke-width:2px
    style Q fill:#f99,stroke:#333,stroke-width:2px
    style R fill:#f99,stroke:#333,stroke-width:2px
    style S fill:#f99,stroke:#333,stroke-width:2px

Last reviewed commit: b766fb1


@greptile-apps greptile-apps bot left a comment


13 files reviewed, 5 comments


@richyreachy
Collaborator Author

@greptile

@richyreachy
Collaborator Author

@greptile

@@ -189,22 +189,6 @@ class HammingMetric : public IndexMetric {
return nullptr;

P1 HammingMetric::batch_distance() removed — now always returns nullptr

The PR removes the entire batch_distance() override from HammingMetric. The base class IndexMetric::batch_distance() returns nullptr by default (see src/include/zvec/core/framework/index_metric.h:85), so callers that rely on this method to get a valid batch-distance function for binary data will now silently receive a null function pointer, potentially causing null-pointer dereferences at runtime.

Additionally, the same PR removes the DT_BINARY32 and DT_BINARY64 cases from SquaredEuclideanMetric::batch_distance() in euclidean_metric.cc (lines 853–862 before this change). Both removals appear unrelated to the stated goal of adding Euclidean one-to-many and should be reverted unless there is an explicit intent to drop batch-distance support for binary metrics.
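
Independently of whether the override is restored, a caller can defend against a null batch function. The sketch below assumes hypothetical function-pointer types (`BatchFn`, `PairFn`) and a hypothetical `compute_distances` helper operating on 32-bit binary words; zvec's real signatures are not shown in this thread and may differ.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical signatures for a per-pair and a one-to-many distance function.
using PairFn = float (*)(const std::uint32_t*, const std::uint32_t*,
                         std::size_t);
using BatchFn = void (*)(const std::uint32_t* const*, const std::uint32_t*,
                         std::size_t, std::size_t, float*);

// Hamming distance over `words` 32-bit words: count of differing bits.
inline float hamming_pair(const std::uint32_t* a, const std::uint32_t* b,
                          std::size_t words) {
  std::size_t bits = 0;
  for (std::size_t w = 0; w < words; ++w) {
    bits += std::bitset<32>(a[w] ^ b[w]).count();
  }
  return static_cast<float>(bits);
}

// Uses the batch function when the metric provides one; otherwise falls back
// to calling the per-pair function for each candidate.
inline void compute_distances(BatchFn batch, PairFn pair,
                              const std::uint32_t* const* candidates,
                              const std::uint32_t* query, std::size_t count,
                              std::size_t words, float* out) {
  assert(pair != nullptr);
  if (batch != nullptr) {
    batch(candidates, query, count, words, out);
    return;
  }
  for (std::size_t i = 0; i < count; ++i) {
    out[i] = pair(query, candidates[i], words);
  }
}
```

A guard like this turns a missing batch implementation into a slower path instead of a crash, but it does not replace restoring the removed override if batch support for binary data is still intended.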
